1. INTRODUCTION


This study was a continuation of the applied research project reported by Norquist and
Balasubramaniam [1]. We continued our investigation of the utility of optical observations of the
solar chromosphere in the diagnosis of flare probability. Because we felt we were not ready to
project the flare probability estimate ahead in time, we kept our focus on inferring flaring
likelihood at the observation time. As in the previous study, we used observed hydrogen-alpha
(Hα) intensity from the U.S. Air Force Improved Solar Observing Optical Network (ISOON)
telescope at Sacramento Peak, NM (Neidig et al. [2]). Sequences of Hα images at one-minute
intervals and one arc second grid spacing for selected sub-regions of solar active regions over
5-10 hour periods comprised the data set. We performed principal component analysis (PCA) on
the sequences to derive the eigenvectors and associated eigenvalues. Sub-region average Hα
intensity and whole disk 1-8 Å x-ray flux from the NOAA Geostationary Operational
Environmental Satellite (GOES) determined a flaring degree category at each image time. A
subset  of  leading  eigenvector  elements  at  each  time  served  as  the  predictors  and  flaring  category 
constituted  the  predictand  in  employing  multivariate  discriminant  analysis  (MVDA).  We 
invoked  MVDA  on  selected  image  sequences  making  up  a  “development  set.”  We  then  applied 
resulting  discriminant  vectors  to  the  eigenvector  elements  of  “application  set”  sequences  to 
determine  flaring  category  probability  at  observation  times.  Comparison  of  diagnosed  probability 
with  specified  flaring  category  over  all  application  times  determined  the  diagnosis  skill. 
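
The PCA step described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: it assumes each image sequence is stored as a (time, y, x) array, that the time mean is removed at each pixel, and that the eigenvectors whose elements are sampled at each image time are the left singular vectors of the resulting data matrix. The function name is hypothetical.

```python
import numpy as np

def pca_time_eigenvectors(images):
    """Return (eigvecs, eigvals) for an image sequence of shape (T, ny, nx).

    eigvecs[:, k] is the k-th eigenvector as a time series (one element per
    image time); eigvals[k] is the variance it explains. Assumed layout only.
    """
    T = images.shape[0]
    X = images.reshape(T, -1).astype(float)   # one row per image time
    X -= X.mean(axis=0)                       # remove the time mean at each pixel
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    eigvals = s**2 / (T - 1)                  # eigenvalues of the covariance matrix
    return U, eigvals
```

The singular values come back in descending order, so the leading columns of the first return value are the leading eigenvectors.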

We  began  the  new  period  of  study  by  adding  ISOON  image  sequences  so  that  more  would  be 
available  to  the  flare  diagnosis  development  and  application  algorithms.  By  acquiring  an 
additional 44 image sequences for specific date-active region sub-regions, we expanded our
available pool of image sequences to 90. In the previous study, we found that the image
sequences  could  be  partitioned  by  flaring  level  indicator  (FLI)  according  to  the  temporal  pattern 
of  the  leading  eigenvectors  and  the  corresponding  x-ray  flux  rise  associated  with  any  flare 
present in the sequence. Norquist and Balasubramaniam [1] predetermined the FLI for each
ISOON sequence based on a subjective assessment of its Hα eigenvector patterns and the
associated x-ray flux peak value. We continued that approach to determine the FLI for the new
sequences added in the present study period. Briefly, the four FLI values were described as
follows: FLI = 0 for no flare above x-ray background and smoothly varying (sinusoidal)
eigenvectors; FLI = 1 for weak flares (peak flux in the same x-ray decade as the background
value) with spiked but otherwise smoothly varying (sinusoidal) eigenvectors; FLI = 2 for moderate
x-ray flares (one decade greater than background) and smoothly curved (non-sinusoidal)
eigenvectors before and after the flare spike; FLI = 3 for strong x-ray flares (two or more decades
above background) with non-curving eigenvectors before and smoothly curving eigenvectors after
sharp spikes. In FLI = 1-3, the eigenvector spike occurs at the same time as the sharp rise in the
x-ray flux. There
must  be  a  simultaneous  eigenvector  spike  and  a  clear  x-ray  flux  rise  in  order  for  a  non-zero  FLI 
category  to  be  assigned  to  an  image  sequence.  For  reference,  see  Figure  7  of  Norquist  and 
Balasubramaniam  [1]  for  examples  of  each  FLI  category.  In  Table  1  we  list  the  90  image 
sequences  distributed  into  three  image  sequence  sets  (ISS)  that  have  an  approximately  equal 
number  of  each  of  the  FLI  categories.  In  Table  1,  Date  is  given  in  YYYYMMDD  format,  AR  # 
is  NOAA  active  region  number  (if  assigned  to  the  active  region  at  time  of  observation),  and  FLI 
is  prescribed  flaring  level  indicator.  Combinations  of  these  three  ISS  served  as  development  and 
application  sets  in  the  present  study. 
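
The x-ray part of the FLI definitions above can be illustrated with a short sketch. It covers only the flux criterion: the FLI values in this study were also based on a subjective assessment of the eigenvector patterns, which is not modeled here. The function name and the use of the integer decade (floor of log10) of each flux value are assumptions made for illustration.

```python
import math

def xray_decade_category(peak_flux, background_flux):
    """Flux-only analogue of FLI from peak and background 1-8 A flux (W/m^2).

    0: peak not above background; 1: peak in the same decade as background;
    2: one decade higher; 3: two or more decades higher. Illustrative only.
    """
    if peak_flux <= background_flux:
        return 0
    decades = (math.floor(math.log10(peak_flux))
               - math.floor(math.log10(background_flux)))
    return min(decades + 1, 3)
```

For example, with a background of 3e-7 W/m^2, a peak of 8e-7 falls in the same decade (category 1) and a peak of 4e-6 falls one decade higher (category 2).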
Table 1. ISOON Hα Image Sequence Sets (ISS) Used In This Study

         ISS 1                     ISS 2                     ISS 3
Date      AR #   FLI      Date      AR #   FLI      Date      AR #   FLI
20021213  10213  1        20021213  10215  1        20021213  10220  1
20021213  10223  1        20030102  10234  0        20030102  10239  0
20021213  10224  0        20030117  10254  0        20030117  10255  0
20021219  10229  3        20030117  10257  0        20030117  10259  0
20030117  10250  0        20030117  10258  1        20030203  10274  0
20030117  10256  0        20030203  10272  0        20030206  -      1
20030117  10260  0        20030318  10318  0        20030318  10319  0
20030203  10276  0        20030331  10321  0        20030331  10324  0
20030318  10323  0        20030331  10326  1        20030401  10318  1
20030331  10323  1        20030401  10321  0        20030401  10325  0
20030331  10325  0        20030401  10323  1        20030509  -      1
20030401  10319  1        20030513  10358  0        20030516  10357  0
20030401  10326  0        20030522  10362  1        20030528  10365  1
20030516  10356  1        20030605  10373  0        20030605  10375  0
20030528  10368  1        20030606  -      0        20030606  10377  1
20030604  10373  0        20030606  10375  1        20030610  10375  0
20030606  10373  0        20030610  10380  0        20030611  10375  2
20030609  10375  1        20030611  10375  2        20030611  10377  0
20030610  10377  0        20030612  10375  1        20030612  10380  0
20030611  10380  2        20030612  10377  0        20030613  10380  1
20030611  10381  0        20030613  10377  0        20030616  10380  0
20030612  10381  0        20030620  10385  0        20030620  10387  0
20030616  10385  0        20030623  10386  0        20030623  10387  0
20030620  10386  1        20030623  10397  0        20030624  10386  0
20030620  10388  0        20030624  10387  1        20031031  10488  1
20030623  10388  0        20030625  10391  0        20031104  10486  3
20030624  10390  0        20031029  10486  3        20040105  -      1
20031104  10486  1        20031104  10486  1        20040316  -      0
20041007  -      0        20041109  10696  2        20050506  10758  2
20050513  10759  3        20050909  10808  3        20061206  10930  3


In the balance of this report, we describe several methods involving four-category MVDA that
used predictor vectors and prescribed predictands derived from Hα intensity and x-ray flux data
at individual image times of the image sequences in Table 1. In Section 2, we briefly review the
best performing version of the flare probability diagnosis development and application
algorithms from the previous study. In Section 3, we describe modifications to that algorithm
pair made to attempt to improve discrimination among the flaring categories. In Section 4 we
relate the results of the evaluation of the two image time flare probability diagnosis methods. In
Section 5 we describe a technique to diagnose flaring for each image sequence as a whole, and
present the results of its assessment. In Section 6 we discuss the modification of the
whole-sequence method and present its results. In Section 7 we relate our preliminary consideration
of using Doppler velocity data from ISOON as a possible supplement to Hα imagery in diagnosing
flares. We end in Section 8 with a discussion of our conclusions from the study.

2.  IMAGE  TIME  FLARE  PROBABILITY  DIAGNOSIS  METHOD  1 

After acquiring the new ISOON Hα image sequences and partitioning them into the ISS as listed
in Table 1, we ran them through the legacy flare probability diagnosis development and
application algorithms from the previous study. Norquist and Balasubramaniam [1] refer to this
version as the Hα Eigenvector Flare Categorization (HEFC) algorithms. In this report we will
refer to it as the Image Time Flare Probability Diagnosis Method 1. See Norquist and
Balasubramaniam [1] for a detailed description of the algorithm.

For our purposes in this report, we summarize the description as follows. The HEFC (hereafter,
Method 1) development algorithm used the eigenvalues corresponding to each generated
eigenvector to determine the number of eigenvectors to use as predictors. For each sequence we
found the number of leading eigenvalues needed to explain 99.9% of the variance, and we used
the greatest such number over all sequences in the development set. This assured that at least
99.9% of the variance was explained in every sequence, and it resulted in the use of the leading
25-50 eigenvectors depending on the development set used. The elements of the set of
eigenvectors used were the predictor vector elements at each image time of the sequence.
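
The eigenvector-count rule just described can be sketched as below. This is an illustrative reading of the rule, assuming the eigenvalues of each sequence are available as a one-dimensional array; the function names are hypothetical.

```python
import numpy as np

def n_for_variance(eigvals, fraction=0.999):
    """Smallest number of leading eigenvalues explaining `fraction` of variance."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cum = np.cumsum(ev) / ev.sum()
    # index of the first cumulative fraction that reaches the threshold
    k = int(np.searchsorted(cum, fraction)) + 1
    return min(k, ev.size)

def n_predictors(eigvals_by_sequence, fraction=0.999):
    """Greatest per-sequence count over a development set, so that every
    sequence has at least `fraction` of its variance explained."""
    return max(n_for_variance(ev, fraction) for ev in eigvals_by_sequence)
```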

Most  of  the  experimentation  with  the  flare  diagnosis  algorithms  described  by  Norquist  and 
Balasubramaniam  [1]  was  focused  on  specifying  the  predictand  at  each  image  time.  In  this  report 
we refer to the specified image time predictand as the “observed” flaring category. In the Method
1 development algorithm, we used the sub-region average Hα intensity of the date-active region
from which the sequence data were taken to determine the image times of the flare rise. The
prescribed non-zero flaring level indicator (FLI) for that image sequence was assigned as the
predictand at those times, which were less than 10 percent of all image times in almost all FLI
1-3 sequences. At all other image times in the sequence and for all times in FLI 0 (non-flaring)
image sequences, the predictand was set to zero. As mentioned in Section 1, a spike in the Hα
eigenvectors coincident with x-ray flux rise assured that the flare had indeed occurred in the
sub-region of the subject date-active region. Since we were using sub-region average Hα intensity
in the HEFC algorithm, we verified that its rise coincided in time with the x-ray flux rise used with
the eigenvector patterns to set the FLI. In the development algorithm, we checked the flare peak
x-ray  flux,  and  maintained  zero  predictands  in  the  flare  rise  if  it  was  less  than  the  background 
value  computed  from  the  prior  day’s  x-ray  flux  time  series.  Background  x-ray  flux  was 
determined  using  the  NOAA  Space  Weather  Prediction  Center  “X-ray  Bkgd  Flux”  method  (see 
http://www.swpc.noaa.gov/wwire.html). 

This  process  of  setting  the  predictor  vector  and  predictand  for  each  image  time  was  repeated  for 
all  image  sequences  in  the  development  set.  All  such  specified  predictor  vector-predictand  pairs 
for  all  image  sequences  in  the  development  set  were  input  into  the  MVDA  routine,  which  was 
Fisher’s  Linear  Discriminant  for  Multiple  Groups  (see  Appendix  B  of  Norquist  and 
Balasubramaniam  [1],  or  Wilks  [3]).  Here,  the  “groups”  are  the  four  FLI  categories.  The  MVDA 
routine  produced  discriminant  vectors  with  the  same  number  of  elements  as  the  predictor  vectors. 
For the four-group MVDA, there were three discriminant vectors, and four group-mean vectors
whose  elements  were  the  means  of  the  eigenvector  elements  over  all  predictor  vectors  associated 
with  each  flare  category.  The  dot  product  of  each  discriminant  vector  with  each  group  mean 
vector  determines  the  component  in  3-D  discriminant  space  of  the  location  of  the  group  mean. 
The  last  step  in  the  Method  1  development  algorithm  was  to  apply  the  discriminant  vectors  to 
each  of  the  predictor  vectors  that  were  used  in  the  MVDA  routine.  This  also  “places”  the 
corresponding  image  time  in  discriminant  space.  Its  distance  from  each  of  the  four  group  means 
determines  the  probability  of  the  four  FLI  categories,  with  the  closest  one  being  the  group  with 
the  largest  probability.  The  four  probabilities  sum  to  one,  with  each  indicating  its  likelihood  of 
occurrence.  By  comparing  these  probabilities  with  the  prescribed  FLI  category,  we  got  a 
preliminary  look  at  how  well  the  scheme  was  able  to  discriminate  among  the  flaring  categories. 

The  Method  1  application  algorithm  ingested  the  discriminant  vectors  and  group  mean  vectors 
from  the  development  algorithm.  It  also  input  the  leading  (same  number  as  used  in  the 
development  algorithm)  eigenvector  elements  for  each  image  time  of  each  image  sequence  of  the 
designated  application  set  as  the  predictor  vector  elements.  In  this  study  period,  we  used  a 
combination  of  two  of  the  ISS  from  Table  1  as  the  development  set  and  the  other  image  sequence 
set  as  the  application  set.  The  discriminant  vectors  were  dotted  with  each  application  set 
predictor  vector,  and  the  resulting  location  in  discriminant  space  was  compared  with  the  four 
group  mean  locations  to  determine  the  probabilities  of  the  four  flaring  categories. 
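
The development and application steps of Method 1 can be illustrated with a compact version of Fisher's linear discriminant for multiple groups. This sketch is not the routine used in the study. The function names are hypothetical, and the conversion from distance in discriminant space to probability (a Gaussian kernel with equal prior weights for the groups) is an assumption, since this section says only that distance determines the probabilities.

```python
import numpy as np

def fit_mvda(X, y, n_groups=4):
    """Development step: return (A, means). Columns of A are the G-1
    discriminant vectors, scaled so the pooled within-group variance along
    each is one; means holds the group-mean predictor vectors.
    Requires every group to be present and more samples than predictors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    grand = X.mean(axis=0)
    means = np.array([X[y == g].mean(axis=0) for g in range(n_groups)])
    Sw = np.zeros((p, p))                      # within-group scatter
    Sb = np.zeros((p, p))                      # between-group scatter
    for g in range(n_groups):
        Xg = X[y == g] - means[g]
        Sw += Xg.T @ Xg
        d = (means[g] - grand)[:, None]
        Sb += np.sum(y == g) * (d @ d.T)
    Sw /= n - n_groups                         # pooled covariance estimate
    Linv = np.linalg.inv(np.linalg.cholesky(Sw))
    w, V = np.linalg.eigh(Linv @ Sb @ Linv.T)  # whitened symmetric eigenproblem
    lead = np.argsort(w)[::-1][: n_groups - 1]
    return Linv.T @ V[:, lead], means

def diagnose(A, means, X):
    """Application step: probabilities (m, G) from squared distance to each
    group mean in discriminant space (assumed Gaussian kernel, equal priors)."""
    z = np.atleast_2d(np.asarray(X, dtype=float)) @ A
    zg = means @ A
    d2 = ((z[:, None, :] - zg[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-0.5 * (d2 - d2.min(axis=1, keepdims=True)))
    return k / k.sum(axis=1, keepdims=True)
```

In this sketch, `fit_mvda` would be run on the predictor vector-predictand pairs of the two development ISS, and `diagnose` on the predictor vectors of the third ISS. The closest group mean always receives the largest probability, and the four probabilities sum to one at each image time.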

3.  IMAGE  TIME  FLARE  PROBABILITY  DIAGNOSIS  METHOD  2 

A  look  at  some  preliminary  results  from  the  Method  1  algorithms  (not  shown)  indicated  some 
degree  of  overlap  among  the  four  groups.  At  many  FLI  0  image  times  in  an  application  sequence, 
one  of  the  non-zero  FLI  categories  was  diagnosed  with  the  largest  probability.  This  was  often 
due  to  none  of  the  four  categories  having  a  probability  exceeding  0.5,  leading  to  an  ambiguity  in 
the  designation  of  most  likely  flaring  category.  We  felt  that  this  suggested  a  need  to  achieve 
greater  separation  among  the  group  means  in  discriminant  space  to  get  a  more  distinctive 
diagnosis  of  flare  category  probability. 

In  seeking  greater  discrimination  among  the  flaring  categories,  we  experimented  with  using 
alternative  forms  of  the  eigenvectors.  We  saw  that  the  range  of  values  for  a  specific  eigenvector 
would  vary  from  sequence  to  sequence.  Since  we  were  using  multiple  image  sequences  in  the 
development  algorithm,  we  sought  a  more  uniform  representation  of  the  eigenvector  information 
across  sequences.  To  that  end,  we  considered  the  use  of  the  time  rate  of  change  of  the 
eigenvectors instead of their actual value at each image time. Since the image cadence is one
minute for the ISOON Hα images, we decided to examine the use of one-minute eigenvector
changes as the predictors.

We  then  applied  a  five-point  smoother  to  the  time  series  of  each  of  the  nine  leading  eigenvectors, 
then  created  a  scatter  plot  of  one-minute  eigenvector  changes  against  one-minute  x-ray  flux 
changes  for  each  eigenvector  of  selected  image  sequences  with  FLI  categories  0-3.  Examples  are 
shown  in  Figure  1  for  each  of  the  four  FLI  categories. 
Figure 1. X-ray Flux vs. Eigenvector Change for FLI (a) 0, (b) 1, (c) 2, and (d) 3 Sequences

From the scatter plots, it is clear that inflections (that is, large one-minute changes) in certain
eigenvectors were coincident with large rises (that is, positive changes) in x-ray flux. In some
cases (especially FLI 1) the sign of the large eigenvector changes was significant, while in others
(FLI 2, 3) both signs were evident in the large changes. From this, we concluded that using
one-minute changes in the leading eigenvectors as predictors and prescribed FLI predictands
assigned during x-ray flux (not area-average Hα) rise times might enhance the discrimination
among flaring categories.

To  implement  this  idea,  we  made  the  following  changes  to  the  Method  1  development  algorithm: 

•  Eliminate use of eigenvalues to determine the number of eigenvectors used as predictors;

•  Impose use of a pre-set number of leading eigenvectors, initially eigenvectors 0-9;

•  Change from eigenvector values to one-minute changes of smoothed eigenvectors as
predictors;

•  Eliminate use of sub-region area-average Hα intensity to indicate times of flare rise;

•  Set prescribed non-zero FLI as predictands at x-ray flare rise times coincident with
eigenvector inflections;

•  Form a predictor vector-predictand pair at every image time for which the image one
minute prior is available.
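
The predictor change listed above can be sketched as follows. This is an illustration under stated assumptions: the five-point smoother is taken to be an equal-weight running mean (its weights are not given here), the images are assumed to be evenly spaced at one minute with no gaps, and the function name is hypothetical.

```python
import numpy as np

def smoothed_one_minute_changes(eigvecs, n_lead=10, width=5):
    """eigvecs: (T, K) array, one column per eigenvector, one row per minute.

    Returns a (T - width, n_lead) array of one-minute changes of the smoothed
    leading eigenvectors. Edge times without a full smoothing window are
    dropped, which is an assumption of this sketch.
    """
    E = np.asarray(eigvecs, dtype=float)[:, :n_lead]
    kernel = np.ones(width) / width            # equal-weight running mean
    smooth = np.column_stack(
        [np.convolve(E[:, k], kernel, mode="valid") for k in range(E.shape[1])]
    )
    return np.diff(smooth, axis=0)             # value at t minus value at t - 1
```

Each row of the result would serve as the predictor vector for one image time, paired with the predictand at that time.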

As  with  Method  1,  the  x-ray  rise  times  constituted  less  than  10  percent  of  all  image  times  in 
almost  all  FLI  1-3  sequences.  In  the  modified  application  algorithm,  we  also  used  as  the 
predictor  vector  elements  the  pre-set  number  of  leading  eigenvectors  at  all  image  times  in  each 
application  sequence  in  which  the  one-minute  change  of  the  smoothed  eigenvectors  was 
available.  The  application  algorithm  computed  the  dot  product  of  the  discriminant  vectors 
derived  in  the  development  algorithm  and  each  predictor  vector,  resulting  in  the  location  of  the 
discriminant  function  value  for  each  image  time  in  discriminant  space.  It  then  computed  the 
application probability of each flare category from its distance from each of the four group mean
locations  in  discriminant  space.  We  designated  the  resulting  development  and  application 
algorithms  the  Image  Time  Flare  Probability  Diagnosis  Method  2. 


4.  ASSESSMENT  OF  FLARE  PROBABILITY  DIAGNOSIS  METHODS  1,  2 

Both  Method  1  and  2  produce  a  diagnosis  of  the  probability  of  the  four  flare  categories  at  each 
image  time  of  an  image  sequence.  In  this  section  we  describe  statistical  metrics  used  to  assess  the 
performance  of  the  two  individual  image  time  flare  probability  diagnosis  methods,  and  present 
results  from  this  assessment. 


We computed a probability-weighted diagnosed flaring category C_P given by

\[ C_{P_i} = \sum_{g=0}^{G-1} g \, p_{g,i} \]    [1]

at each image time i, where in our study there are G = 4 groups g = 0, 1, 2, and 3, and p_{g,i} is
the diagnosed probability of group g at image time i. From C_{P_i} and C_{O_i}, the observed
category at each image time i, several statistical metrics were determined for the image sequence.
They are Brier Score

\[ \mathrm{BS} = \frac{1}{N (G-1)^2} \sum_{i=1}^{N} \left( C_{P_i} - C_{O_i} \right)^2 \]    [2]

Bias

\[ \mathrm{Bias} = \frac{1}{N (G-1)} \sum_{i=1}^{N} \left( C_{P_i} - C_{O_i} \right) \]    [3]

and Diagnosis Uncertainty

\[ \mathrm{DU} = \frac{1}{N} \sum_{i=1}^{N} \left[ 1 - \max_{g} \left( p_{g,i} \right) \right] \]    [4]

In our study, N represents the number of image times in the respective image sequence. In
addition, frequency distribution fit (FDF) defined by

\[ \mathrm{FDF} = \frac{1}{N_a} \sum_{g=0}^{G-1} [\,\ldots\,] \]    [5]

was computed over all image times in all application set sequences (N_a), where m_g is the number
of image times in which group g was the most likely category (greatest diagnosed probability),
and n_g is the number of observed group g image times.
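
The per-sequence metrics can be computed as in the sketch below. It assumes the readings of Eqs. [1]-[4] that are consistent with the worked values quoted later in this section: PWDFC is the probability-weighted category, Brier Score and Bias are normalized by N(G-1)^2 and N(G-1), and DU is the mean of one minus the largest probability. FDF is not included. The function name is hypothetical.

```python
import numpy as np

def diagnosis_metrics(probs, observed):
    """probs: (N, G) diagnosed probabilities; observed: (N,) observed category.

    Returns PWDFC per image time plus Brier Score, Bias, and Diagnosis
    Uncertainty for the sequence (assumed forms of Eqs. 1-4).
    """
    P = np.asarray(probs, dtype=float)
    obs = np.asarray(observed, dtype=float)
    N, G = P.shape
    pwdfc = P @ np.arange(G)                   # sum over g of g * p_g
    diff = pwdfc - obs
    return {
        "pwdfc": pwdfc,
        "brier": np.sum(diff**2) / (N * (G - 1) ** 2),
        "bias": np.sum(diff) / (N * (G - 1)),
        "du": np.mean(1.0 - P.max(axis=1)),
    }
```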


We  executed  the  development  algorithm  of  Image  Time  Flare  Probability  Diagnosis  Methods  1 
and  2  using  successive  pairs  of  the  three  ISS  listed  in  Table  1.  We  then  applied  the  resulting 
discriminant  vectors  to  the  third  (independent)  image  sequence  set  in  the  corresponding 
application  algorithm.  This  yielded  separate  values  of  the  statistical  metrics  for  each  of  the  three 
application  sets  by  image  sequence  and  overall.  In  the  following  illustrations  of  the  results,  we 
show  the  results  from  Methods  1  and  2  in  direct  comparison  for  each  application  set. 
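
The rotation of development and application sets described above amounts to holding out each ISS in turn. A small sketch follows, with labels in the "2_3" style that this report uses for development sets. The function name is hypothetical.

```python
from itertools import combinations

ISS_IDS = (1, 2, 3)

def development_application_splits(ids=ISS_IDS):
    """Return (development label, application set id) for every split in
    which all but one image sequence set is used for development."""
    splits = []
    for dev in combinations(ids, len(ids) - 1):
        app = next(i for i in ids if i not in dev)   # the held-out set
        splits.append(("_".join(str(i) for i in dev), app))
    return splits

# Development sets 1_2, 1_3, 2_3 pair with application sets 3, 2, 1.
```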


In  Figure  2,  we  show  Brier  Score  results  from  the  two  methods  for  all  three  application  sets.  In 
the plots, “D/A” stands for “Diagnosis/Application.” We consider Brier Score to be the best
single  indicator  of  the  performance  of  the  flare  category  probability  diagnosis  techniques.  It 
measures  the  mean  squared  difference  between  the  probability-weighted  diagnosed  flare 
category  (PWDFC)  and  the  observed  flare  category.  A  perfect  diagnosis  of  the  flare  category  at 
all  image  times  would  yield  a  Brier  Score  of  0.  Brier  Score  increases  as  the  average  difference 
between  PWDFC  and  observed  category  over  the  image  times  in  a  sequence  gets  larger,  with  a 
maximum (worst) value of 1. Because all four categories generally have a non-zero probability,
the PWDFC tends to stay between 0.5 and 2.5 so that we would never expect either a perfect or
worst  Brier  Score. 


In  the  results  shown  in  Figure  2,  the  D/A  Method  1  appears  to  perform  somewhat  better  for  most 
image  sequences  in  Application  Sets  1  and  3,  but  Method  2  is  clearly  better  in  Application  Set  2. 
The  two  methods  most  closely  correspond  to  each  other  in  Application  Set  2.  Interestingly,  the 
largest Brier Score values were for the image sequences from 20030117 in all three application
sets. Norquist and Balasubramaniam [1] found that Method 1 and earlier flare category
probability diagnosis algorithms tended to perform worst on prescribed FLI 0 sequences. Those
algorithms diagnosed the largest probability for a non-zero FLI category in an excessive number
of image times. In this study, we did not find that to be true; that is, poor Brier Scores were not
exclusive to the FLI 0 sequences, and many FLI 0 sequences had quite good Brier Scores. For
example, Application Set 1 image sequences 20030117_10250 and 20030604_10373 had Brier
Scores of 0.19 and 0.10 respectively from Method 1, yet they were both prescribed FLI 0
sequences. This was also true of the FLI 0 sequences 20030605_10375 (BS of 0.20) and
20030610_10375 (BS of 0.06) from Method 2 in Application Set 3. Similarly, there was no definite
association between the Brier Score and percentage of non-zero FLI image times in the flaring
sequences. Brier Scores seemed to be insensitive to the number of observed flaring times in the
image sequence.

Figure 2. Brier Score of Methods 1 and 2 for Application Sets (1), (2) and (3)
Figure  3  shows  the  Bias  from  the  two  development/application  methods.  It  shows  very  nearly  the 
same  patterns  as  Figure  2  for  Brier  Score,  primarily  because  the  diagnosis  methods 
systematically  diagnose  a  PWDFC  larger  than  the  observed  category  at  the  image  times.  If  there 
were  offsetting  diagnoses  of  positive  and  negative  PWDFC-minus-observed  FLI,  the  Bias  would 
have a different pattern from sequence to sequence than the Brier Score. However, we know that
cannot happen because, as was mentioned earlier, the PWDFC generally ranges between 0.5 and
2.5, and the observed FLI (OFLI) values at image times are dominated by zeros. Therefore, there is an
inherent  positive  value  to  the  Bias  metric.  To  get  an  idea  of  the  lower  limit  of  Bias  for  a 
prescribed  FLI  0  image  sequence,  we  start  by  noting  that  an  image  time  probability  diagnosis  of 
0.85,  0.05,  0.05,  and  0.05  for  FLI  0-3  respectively  corresponds  to  PWDFC  =  0.3.  If  this 
diagnosis  were  made  at  all  image  times  in  the  sequence,  a  Bias  of  0.10  would  result.  This  would 
seem  to  be  an  approximate  lower  limit  to  the  Bias  for  image  sequences.  More  realistically,  at 
least  for  Methods  1  and  2,  a  diagnosis  of  0.70,  0.15,  0.10,  and  0.05  would  likely  be  the  best  that 
could  be  achieved,  giving  a  PWDFC  =  0.5.  If  repeated  at  all  image  times  in  the  sequence,  a  Bias 
of  about  0.17  would  result,  considered  a  “best”  Bias  for  a  prescribed  FLI  0  sequence  processed 
by  these  methods. 
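
The two Bias limits quoted above follow directly from the definitions, assuming the Bias of Eq. [3] is the mean PWDFC-minus-observed difference divided by G - 1 = 3 and the observed category is 0 at every image time:

```latex
\[
C_P = 0(0.85) + 1(0.05) + 2(0.05) + 3(0.05) = 0.30,
\qquad \mathrm{Bias} = \frac{0.30 - 0}{3} = 0.10
\]
\[
C_P = 0(0.70) + 1(0.15) + 2(0.10) + 3(0.05) = 0.50,
\qquad \mathrm{Bias} = \frac{0.50 - 0}{3} \approx 0.17
\]
```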

In  sequences  where  the  number  of  OFLI  0  was  grossly  under-represented  by  the  probability 
diagnosis  (that  is,  where  the  diagnosed  most  likely  category,  DFLI,  was  >  0  at  most  image 
times), an excessive number of false alarms was indicated. For example, the FLI 0 image
sequence 20030117_10257 with a bias of 0.51 had only 70 of 216 image times with DFLI 0 in
Method  1.  The  other  image  times  had  DFLI  of  1-3,  which  of  course  would  be  considered  false 
alarms.  By  contrast,  in  Method  2  the  FLI  0  sequence  20030620_10385  had  a  bias  of  0.19  with 
306  of  393  image  times  with  DFLI  0,  resulting  in  a  much  smaller  false  alarm  rate. 
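
The counts quoted above can be turned into false alarm fractions with a one-line calculation. The definition used here (the share of image times in an FLI 0 sequence whose most likely diagnosed category is non-zero) is an assumption, since the report does not define the rate formally. The function name is hypothetical.

```python
def false_alarm_fraction(n_dfli0, n_times):
    """Share of image times in an FLI 0 sequence diagnosed as flaring (DFLI > 0)."""
    return (n_times - n_dfli0) / n_times

# 20030117_10257, Method 1: 70 of 216 image times had DFLI 0
print(round(false_alarm_fraction(70, 216), 2))    # 0.68
# 20030620_10385, Method 2: 306 of 393 image times had DFLI 0
print(round(false_alarm_fraction(306, 393), 2))   # 0.22
```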

The disparity of flare category probability diagnosis performance among ISOON image
sequences raises the question “why are some sequences diagnosed so much better than others?”
While it is outside the scope of this report to investigate this matter thoroughly, we can at least
look at a couple of sequences with contrasting performance to suggest an answer. In Figure 4(a)
we show the leading nine eigenvectors for the FLI 0 image sequence 20030117_10250, while in
Figure 4(b) we show the PWDFC at all image times resulting from the Method 2 probability
diagnosis (Bias = 0.48) based on those eigenvectors. We show the same pair of plots for the FLI
0 sequence 20030620_10385 in Figures 5(a) and (b) respectively using the Method 2 probability
diagnosis (Bias = 0.19). Figure 4(a) for FLI 0 sequence 20030117_10250 depicts the smooth,
sinusoidal pattern commonly associated with FLI 0 sequences. However, the resulting Method 2
PWDFC in Figure 4(b) hovers around 1.5, suggesting that the discriminant vectors for
development set 2_3 in this case are placing the discriminant functions evaluated from those
eigenvectors closer to groups 1 and 2 (weak to moderate flaring) than to group 0 (non-flaring).
The leading eigenvectors shown in Figure 5(a) for FLI 0 sequence 20030620_10385 are
strikingly different from those in Figure 4(a): flat with occasional spikes, with only a hint of
sinusoidal variation in eigenvectors 6 and 8. Figure 5(b) shows that the resulting probability
diagnosis clearly favors group 0, and in fact results in quite high probability of the non-flaring
category (not shown) at most of the image times. We can’t draw any hard and fast conclusions
from two cases, but there does seem to be a great deal of sensitivity in the diagnosis performance
to the patterns of the eigenvectors. This pair of cases highlights the great variations of these
patterns associated with a single flaring category. As such, it suggests that the limited
discrimination among the flaring categories may be due to a lack of distinction among the
eigenvector patterns of the image sequences of a particular prescribed FLI category when used in
development. This would increase the in-group scatter in discriminant space and lead to less
distinction among the groups.

Figure 3. Same as in Figure 2 Except for Bias

Figure 4. Sequence 20030117_10250 (a) 1st 9 Eigenvectors, (b) Method 2 PWDFC vs OFLI

[Plot title: Case 1_3: Diagnosed v. Observed Flare Category 20030620_10385]

Figure 5. Same as in Figure 4 Except for Image Sequence 20030620_10385

We  also  show  some  results  for  FLI  1-3  image  sequences  from  image  time  flare  category 
diagnosis  Method  2  in  Figures  6-9.  The  leading  nine  smoothed  eigenvectors  (Figure  6(a))  and  the 
diagnosed  PWDFC  vs.  OFLI  (Figure  6(b))  are  shown  for  the  image  sequence  20030522_10362. 
The  diagnosis  yielded  a  Brier  Score  of  0.082  for  this  FLI  1  sequence.  A  C-class  flare  occurred 
between  20  and  22  UTC  that  is  evident  as  inflections  in  several  of  the  leading  eigenvectors.  The 
plot  of  PWDFC  in  Figure  6(b)  shows  quite  a  bit  of  variation  during  the  sequence,  but  tends  to 
remain  around  0.5  except  for  several  jumps  to  1.5  or  so.  The  same  plots  are  shown  for  another 
FLI  1  sequence,  20030528_10368,  in  Figure  7.  This  case  had  C-class  flares  between  14  and  15 
UTC  and  between  20  and  21  UTC  as  shown  in  Figure  7(b).  The  corresponding  inflections  are 
evident  in  some  of  the  leading  eigenvectors  in  Figure  7(a).  However,  in  this  sequence  the 
diagnosis  of  PWDFC  is  greater,  staying  above  1  at  most  of  the  image  times.  As  a  result,  the  Brier 
Score  for  this  sequence  was  not  as  good,  0.175.  In  Figures  8  and  9,  we  show  leading 
eigenvectors  and  Method  2  PWDFC  vs.  OFLI  for  the  FLI  2  sequence  20050506_10758  (Brier 
Score  0.052)  and  the  FLI  3  sequence  20050909_10808  (Brier  Score  0.071).  Both  of  these 
diagnoses  represent  the  flare  well.  This  is  likely  due  to  the  prominent  inflections  in  the  leading 
eigenvectors  and  the  characteristic  eigenvector  patterns  representing  these  two  FLI  categories. 

Figure  10  depicts  the  Diagnosis  Uncertainty  (DU)  over  the  image  sequences  of  the  three 
application  sets.  DU  is  a  measure  of  the  collective  probabilities  of  flaring  categories  not 
identified  with  the  largest  probability.  It  is  the  complement  to  the  degree  of  certainty  of  the  most 
probable  category.  It  does  not  directly  relate  to  PWDFC  accuracy,  like  Brier  Score.  However, 
especially  for  Method  2  we  see  a  strong  similarity  in  the  sequence-to-sequence  variation  of  Brier 
Score  (Figure  2)  and  DU  (Figure  10).  On  the  other  hand,  a  difference  between  the  displays  of 
Brier  Score  and  DU  is  that  the  distinction  between  Methods  1  and  2  in  Application  Set  3  for 
Brier  Score  is  greater  for  DU.  It  appears  that  greater  certainty  of  the  most  probable  category  is 
generally associated with category diagnosis accuracy. The levels of DU for both methods in all
but perhaps Application Set 2 are disappointing. A good diagnosis would have a higher than 50
percent  probability  of  the  most  probable  category.  We  see  in  Figure  10  that  in  many  sequences 
the  average  DU  approaches  or  exceeds  50  percent.  In  only  6  of  the  30  sequences  in  Application 
Set  2  does  the  DU  get  below  30  percent,  or  a  most  likely  probability  of  70  percent  that  we 
suggested  earlier  might  be  the  best  we  could  do  for  PWDFC  in  these  methods  across  a  sequence. 
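
The relationship between the category probabilities, DU, and PWDFC can be illustrated with a minimal sketch. It assumes four FLI categories (0-3) and assumes that PWDFC is the probability-weighted mean category, which is a reading of its name rather than a formula stated in this section. The function names and the sample probabilities are hypothetical.

```python
import numpy as np

def diagnosis_uncertainty(probs):
    """DU: summed probability of all categories other than the most probable one."""
    probs = np.asarray(probs, dtype=float)
    return 1.0 - probs.max(axis=-1)

def pwdfc(probs):
    """Probability-weighted flare category (assumed form: sum of k * p_k, k = 0..3)."""
    probs = np.asarray(probs, dtype=float)
    categories = np.arange(probs.shape[-1])
    return probs @ categories

# Hypothetical probabilities for FLI 0-3 at one image time.
p = np.array([0.10, 0.55, 0.25, 0.10])
du = diagnosis_uncertainty(p)   # 1 - 0.55
score = pwdfc(p)                # 0*0.10 + 1*0.55 + 2*0.25 + 3*0.10
```

Under this reading, a DU below 0.5 means the most probable category carries more than half of the total probability.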

Figure 6. Same as in Figure 4 Except for Image Sequence 20030522_10362

Figure 7. Same as in Figure 4 Except for Image Sequence 20030528_10368

Figure 8. Same as in Figure 4 Except for Image Sequence 20050506_10758

Figure 9. Same as in Figure 4 Except for Image Sequence 20050909_10808

Figure 10. Same as in Figure 2 Except for Diagnosis Uncertainty

In addition to the statistical metrics computed for each image sequence as shown in Figures 2, 3
and  10,  we  present  their  values  determined  from  all  image  times  over  all  sequences  in  each 
application  set  in  Table  2. 


Table 2. Methods 1, 2 Statistical Metrics for All Image Times in Each Application Set

               Application Set 1     Application Set 2     Application Set 3
               Method 1  Method 2    Method 1  Method 2    Method 1  Method 2
Brier Score      0.12      0.15        0.15      0.10        0.10      0.13
Bias             0.31      0.35        0.35      0.27        0.28      0.31
DU               0.44      0.48        0.45      0.39        0.45      0.43
FDF              0.89      0.92        0.87      0.56        0.66      0.82

Method  2  performed  the  best  overall  on  Application  Set  2  (Development  Set  1_3)  followed  by 
Method  1  on  Application  Set  3  (Development  Set  1_2).  Both  methods  varied  in  their  diagnosis 
skill  among  the  application  sets,  with  Method  1  slightly  better  than  Method  2  on  Application  Sets 
1  and  3  and  Method  2  somewhat  better  than  Method  1  on  Application  Set  2.  Based  on  these 
overall  results,  there  is  not  a  strong  argument  for  one  method  over  the  other  in  flare  category 
diagnosis  performance.  We  would  only  recommend  Method  2  in  that  it  is  simpler  to  implement, 
does  not  rely  on  area-average  Ha  intensity,  and  achieved  the  best  overall  performance. 

5.  WHOLE  SEQUENCE  ALGORITHM  DEVELOPMENT/APPLICATION 

In  an  attempt  to  improve  the  discrimination  among  flaring  categories,  we  next  took  a  radically 
different  approach.  Instead  of  trying  to  diagnose  flare  category  for  each  individual  image  time  as 
done  to  this  point,  we  arrived  at  the  idea  of  attempting  to  diagnose  flare  category  for  each 
sequence  as  a  whole.  The  idea  came  from  the  observation  that  the  leading  eigenvectors  did  seem 
to  fall  into  four  distinct  patterns  as  described  above,  with  each  pattern  associated  with  a 
corresponding  flaring  level  indicator  (FLI).  We  decided  to  represent  the  distinct  pattern 
numerically  in  the  MVDA  predictor  vector  elements  for  each  sequence,  and  assign  the  FLI  for 
that sequence as the predictand category. The representation was in the form of a frequency
distribution of the 1-minute changes for each of the leading eigenvectors.

The frequency distributions were constructed for all 60 image sequences in each development set
combination of ISS pairs. The Nb 1-minute change bins were established separately for each of
Ne leading eigenvectors, resulting in Ne x Nb fractional frequency of occurrence values that were
the predictors for each sequence. The predictand was the prescribed FLI value based on the
eigenvector patterns and x-ray peak flux as done in the earlier methods. To set the size bins for
each eigenvector, we sorted all of the 1-minute changes over all 60 sequences from largest
negative to largest positive values. We then found the first and 99th percentile values, and used
them to set outer bounds for the range of values. The outer bounds were set to the next integer
value less than the first and greater than the 99th in the same power of 10. For example, if the
first and 99th percentile values were -0.00467 and 0.00631 respectively, we set the outer bounds
at -0.005 and 0.007 respectively. We then divided this range into Nb equally sized bins. For each
sequence, we counted the number of 1-minute changes that fell into each size bin corresponding
to each of the Ne eigenvectors, and divided each count by the total number of 1-minute changes
in the sequence. The resulting frequency of occurrence values for each of the leading Ne
eigenvectors made up the predictor vector elements for that sequence. In the same way, we
computed the frequency of occurrence for the Nb size bins for each eigenvector over all 60
sequences, separately by FLI category. The latter accounting was done simply to assess the
overall differences in frequency distribution of the 1-minute changes by FLI category.
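
The binning procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the percentiles are assumed to be nonzero, and 1-minute changes that fall outside the outer bounds are assumed to be counted in the end bins, a detail the text does not specify.

```python
import math
import numpy as np

def outer_bound(value, upper):
    """Round a percentile outward to the next integer multiple of its own power of 10."""
    scale = 10.0 ** math.floor(math.log10(abs(value)))
    steps = math.ceil(value / scale) if upper else math.floor(value / scale)
    return steps * scale

def bin_edges(all_changes, n_bins=10):
    """Equal-width bin edges spanning the rounded 1st to 99th percentile range."""
    lo, hi = np.percentile(all_changes, [1, 99])
    return np.linspace(outer_bound(lo, upper=False), outer_bound(hi, upper=True), n_bins + 1)

def sequence_predictors(eigvecs, edges_per_eigvec):
    """eigvecs: (n_e, n_times) array; returns n_e * n_b fractional frequencies."""
    features = []
    for series, edges in zip(eigvecs, edges_per_eigvec):
        changes = np.diff(series)                        # 1-minute changes
        clipped = np.clip(changes, edges[0], edges[-1])  # assumption: outliers go to the end bins
        counts, _ = np.histogram(clipped, bins=edges)
        features.append(counts / changes.size)
    return np.concatenate(features)
```

With the example values from the text, `outer_bound(-0.00467, upper=False)` gives approximately -0.005 and `outer_bound(0.00631, upper=True)` gives approximately 0.007. In the procedure described, `all_changes` would be the changes for one eigenvector pooled over all 60 development sequences.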

The 60 predictor vector - predictand sets were entered into the MVDA development algorithm to
determine the three discriminant vectors of Ne x Nb elements. These were then applied back to
the  60  predictor  vectors  used  to  develop  them,  just  to  test  the  MVDA  algorithm  process.  The 
discriminant  space  position  of  the  resulting  discriminant  function  values  for  each  FLI  category 
was  compared  to  the  position  of  the  four  group  means  for  each  sequence,  and  the  probability  of 
each  category  was  diagnosed.  The  result  was  perfect  discrimination  -  the  probability  of  the 
original  FLI  predictand  value  category  was  one,  and  the  probabilities  of  the  other  three 
categories  were  zero.  This  is  because  each  predictor  vector  to  which  the  discriminant  vector  was 
applied  was  a  predictor  vector  used  to  develop  the  discriminant  vector,  so  the  development  and 
application steps were validated. We settled on the use of Ne = 8 leading eigenvectors and Nb =
10 size bins, resulting in 80 predictor elements used in the MVDA development algorithm.

We  then  applied  the  discriminant  vectors  derived  from  a  particular  development  set  of  60  image 
sequences  to  an  application  set  of  30  independent  image  sequences.  The  three  separate  sets  of  30 
image  sequences  used  in  this  report  were  listed  in  Table  1.  For  example,  we  derived  discriminant 
vectors from development set 1_2, made up of 60 image sequences combined from sets 1 and 2,
and applied them to application set 3. In the application algorithm, the predictor vectors are
formed in the same way as described above, using the Ne = 8 leading eigenvectors and Nb = 10
size bins to develop the frequency of occurrence values for each sequence in the application set.
The  discriminant  vectors  from  the  development  set  are  then  applied  to  the  predictor  vector  of 
each  sequence  in  the  application  set.  The  position  in  discriminant  space  from  the  resulting 
discriminant  function  values  is  compared  with  the  position  of  the  development  set  discriminant 
function  means  of  each  category.  The  distance  from  each  one  determines  the  probability  of  each 
flare category. The closer the application diagnosis position is to the position of one of the group
means, the higher the probability and the lower the diagnosis uncertainty. However, if there is
not  a  lot  of  distinction  among  the  development  set  discriminant  function  means,  there  is  a  greater 
ambiguity  in  the  application  diagnosis,  and  the  diagnosis  uncertainty  is  greater.  Thus,  it  is  the 
goal  of  the  multivariate  discriminant  analysis  to  maximize  the  separation  among  the  group 
means,  and  to  minimize  the  scatter  of  the  development  set  discriminant  function  values  within 
each  group  or  category.  The  procedure  will  succeed  or  fail  depending  on  these  factors,  so  the 
trick  is  to  design  the  predictors  and  predictands  (that  is,  the  categories)  to  optimize  these  factors. 
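
The link between distance in discriminant space and category probability can be illustrated with a short sketch. This section does not give the probability formula, so the sketch assumes equal prior probabilities and unit within-group variance, which makes each probability proportional to exp(-d^2/2) for squared distance d^2 to the group mean. The function name and the group means are hypothetical.

```python
import numpy as np

def category_probabilities(position, group_means):
    """Probabilities of each category from distances to group means in discriminant space.

    Assumption: equal priors and unit within-group variance, so p_g is
    proportional to exp(-d_g**2 / 2).
    """
    position = np.asarray(position, dtype=float)
    group_means = np.asarray(group_means, dtype=float)
    d2 = ((group_means - position) ** 2).sum(axis=1)
    weights = np.exp(-0.5 * (d2 - d2.min()))  # subtract the minimum for numerical stability
    return weights / weights.sum()

# Hypothetical group means for FLI 0-3 in a three-dimensional discriminant space.
means = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
well_separated = category_probabilities([0.0, 0.0, 0.0], means * 5.0)
ambiguous = category_probabilities([0.25, 0.25, 0.25], means)
```

When the group means are far apart, a position near one of them yields a probability close to one. When they lie close together, the largest probability can fall below 0.5, which corresponds to a diagnosis uncertainty above 0.5.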

We  first  assess  the  performance  of  the  whole  sequence  development  and  application  algorithms 
by  looking  at  the  probability-weighted  diagnosis  of  flare  category  (PWDFC).  Unlike  the  earlier 
methods  that  diagnose  the  probability  of  each  FLI  category  and  thus  the  PWDFC  at  each  image 
time,  here  we  determine  PWDFC  from  the  four  category  probabilities  diagnosed  for  the  entire 
image  sequence.  In  Figure  11,  we  show  the  diagnosed  PWDFC  from  the  whole  sequence  method 
for  all  sequences  in  each  of  the  three  application  sets.  For  reference,  we  also  show  the  observed 
FLI (OFLI) for each sequence in each of the application sets. We see that the majority of the
sequences have diagnosed PWDFC between 1 and 2, with occasional diagnoses above or below
this range. By contrast, OFLI flare categories 0 and 1 predominate in the sequences in all three
application sets, with only a small minority of sequences having OFLI 2 or 3. The average
PWDFC-OFLI differences for the three application sets were 0.8, 0.9, and 0.9 respectively. There
does not appear to be any perceptible correspondence between the sequence-to-sequence variation of
PWDFC and OFLI. For the 11 sequences with OFLI 2 or 3, only one (20041109_10696 in
Application Set 2) had a corresponding rise in flare category to a comparable value.

Figure 11. Whole Sequence Method PWDFC vs. OFLI for Application Sets (1), (2), (3)

We  list  the  performance  metrics  results  for  the  whole  sequence  development  and  application 
algorithms  in  Table  3.  In  this  case,  N  =  30,  the  number  of  image  sequences  in  each  application 
set.  The  Brier  Score  and  Bias  are  computed  from  the  values  shown  in  the  PWDFC  vs.  OFLI  plots 
in Figure 11. Recall that Brier Score and Bias are scaled to the range 0 to 1, so that the flare
category bias is three times the Bias metric in Table 3. The metrics indicate a consistency
among  the  three  separate  application  sets,  implying  the  same  for  the  development  sets  from 
which  the  application  discriminant  vectors  were  derived.  Since  the  metrics  here  are  computed 
over  30  sequences  for  each  application  set  rather  than  thousands  of  individual  image  times  as  in 
the  previous  sections  of  this  report,  we  can’t  make  a  direct  comparison  with  the  metrics  values  in 
Table  2.  It  is  clear  from  Table  3  metrics  that  the  whole  sequence  development/application 
algorithms  produce  a  consistent  positive  flare  category  bias.  This  is  indicated  by  both  the  Bias 
and  the  comparison  of  percentage  of  sequences  with  diagnosed  FLI 0  (%  DFLI 0)  and  observed 
FLI  0  (%  OFLI  0)  entries  in  Table  3.  The  diagnosed  flare  categories  DFLI  are  designated  by  the 
category  with  the  largest  diagnosed  probability  for  each  sequence.  Table  3  values  indicate  that 
the  number  of  FLI  0  sequences  is  greatly  under-represented  in  the  diagnoses,  meaning  that  there 
are  an  excessive  number  of  flare  false  alarms.  Diagnosis  uncertainty  values  in  Table  3,  each  of 
which  is  the  individual  sequence  diagnosis  uncertainty  averaged  over  all  sequences  in  the 
respective  application  set,  show  that  the  highest  diagnosed  probability  is  on  average  less  than 
0.5.  This  means  that  there  is  less  of  a  chance  that  the  diagnosed  most-likely  category  is  correct 
than  that  it  is  incorrect.  Finally,  the  large  frequency  distribution  fit  (FDF)  values  show  that  the 
diagnosed  FLI  categories  for  the  sequences  poorly  represented  the  actual  observed  FLI  frequency 
distribution.  Improperly  diagnosed  FLI  0  sequences  contributed  greatly  to  this  shortcoming. 
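
The factor of three between the scaled Bias and the flare category bias can be checked against the average PWDFC-OFLI differences quoted earlier (0.8, 0.9, and 0.9). The short calculation below uses the Bias values from Table 3; the variable names are illustrative only.

```python
# Scaled Bias values from Table 3 for application sets 1, 2 and 3.
scaled_bias = [0.27, 0.29, 0.29]
n_categories = 4                     # FLI 0-3
category_range = n_categories - 1    # largest possible category difference

# Flare category bias in category units, rounded to one decimal place.
category_bias = [round(b * category_range, 1) for b in scaled_bias]
```

The rounded results agree with the average PWDFC-OFLI differences reported for the three application sets.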

Table 3. Performance Metrics for the Whole Sequence Development/Application Process

Application Set   Brier Score   Bias   Diagnosis Uncertainty   FDF    % DFLI 0   % OFLI 0
1                 0.16          0.27   0.56                    1.00   16.67      60.00
2                 0.17          0.29   0.54                    0.93   13.33      56.67
3                 0.16          0.29   0.62                    0.80   30.00      56.67

To gain a better understanding of how the frequency distribution of 1-minute eigenvector
changes may have affected the diagnosis of FLI, we show the frequency distribution of the eight
leading eigenvectors derived from the image sequences of development set 1_2 in Figure 12. The
frequency distributions for the eigenvectors are shown separately for each FLI. Keep in mind
that the size bins depend only on eigenvector and not on FLI, so they are the same in all FLI
categories for a given eigenvector. We can immediately see a similarity between FLI 0 and 1,
and between 2 and 3. In FLI 0 and 1, the two smallest size bins have frequencies of occurrence
of less than 0.5 for all eigenvectors, and 0.05 - 0.15 for the next larger size bins on either side of
zero. By contrast, in FLI 2 and 3 the frequency of occurrence extends well beyond 0.5 for some
eigenvectors, and the two smallest size bins are much less symmetric about zero than in FLI 0
and 1. The larger size bins have smaller frequencies of occurrence for all eigenvectors in FLI 2
and 3, particularly bin -3, bin -2, bin 2 and bin 3. Because these frequency distributions are the
basis for the discriminant vectors derived by the MVDA algorithm, we would expect a greater
separation between FLI 0-1 and FLI 2-3 than we would between 0 and 1 and between 2 and 3.

Figure 12. Whole Sequence Method Frequency Distribution for Development Set 1_2

When we diagnosed FLI for the image sequences of application set 3, we applied the derived
discriminant vectors from development set 1_2 to the frequency distributions of the 1-minute
eigenvector changes from each application image sequence. The frequency distributions from
four selected sequences of application set 3 are shown in Figure 13. The chosen image
sequences were: 20030203_10274 (OFLI 0), 20030528_10365 (OFLI 1), 20050506_10758
(OFLI 2), and 20061206_10930 (OFLI 3). The application of the discriminant vectors to these
frequency distributions of the 1-minute changes of eight leading eigenvectors yields a
discriminant  function  value  in  discriminant  space.  A  comparison  of  its  position  with  respect  to 
means  of  the  four  FLI  groups  determines  the  probability  of  each  group.  In  the  four  charts  of 
Figure  13,  we  indicate  the  diagnosed  FLI  (DFLI)  and  the  observed  FLI  (OFLI)  for  the  four 
selected sequences whose frequency distributions are displayed. We would expect that the FLI
category of the development set having the 1-minute change eigenvector frequency distributions
most similar to those of the particular application sequence would have been the one diagnosed for
that  sequence.  We  do  see  the  greater  frequencies  of  occurrence  concentrated  in  the  two  smallest 
size  bins  in  the  OFLI  2  and  3  application  sequences  as  expected.  The  fact  that  the  OFLI  0  and  1 
application  sequences  have  substantial  frequencies  of  occurrence  in  bin  -2  and  2  unlike  the  OFLI 
2  and  3  sequences  should  have  placed  these  sequences  closer  to  FLI  0  and  1  in  the  diagnosis. 
However,  this  was  clearly  not  the  case  -  both  were  diagnosed  in  discriminant  space  as  lying 
closest  to  the  group  3  mean,  but  not  by  much.  In  fact,  the  probability  of  FLI  3  was  only  0.34  and 
0.33 respectively, indicating diagnosis uncertainties of 0.66 and 0.67. For the other two selected
application sequences with DFLI 0 and 3 and OFLI 2 and 3, the probabilities of the DFLI
categories were 0.29 and 0.53, or diagnosis uncertainties of 0.71 and 0.47. Clearly the level of
discrimination  among  the  FLI  categories  is  low,  and  there  is  excessive  ambiguity  in  the  FLI 
classification  of  any  application  sequence. 

Figure 13. Same as Figure 12 Except for Four Image Sequences of Application Set 3

Approved  for  public  release;  distribution  is  unlimited. 


There are two major reasons why the discrimination is lacking in this approach to classification
of image sequences by FLI. First, there is too much similarity among the 1-minute eigenvector
change frequency distributions partitioned by FLI as shown in Figure 12. The discriminant
functions of the FLI group means lie too close together in discriminant space to indicate a clear
distinction among them. The other major reason is that the 1-minute eigenvector change
frequency distribution of any given application sequence is too dissimilar to any of the frequency
distributions by FLI of the development sequences. Both of these possibilities suggest there is
perhaps too much variation in the eigenvector patterns among the sequences of a particular FLI
category. This precludes a distinct frequency distribution derived from the development
sequences for each FLI category, and leaves no strong resemblance of an application sequence
frequency distribution to any of the FLI categories’ frequency distributions.

6.  PRE-FLARE  ALGORITHM  DEVELOPMENT/APPLICATION 

In looking again at the eigenvector patterns by FLI group as exemplified by the selected image
sequences in Figures 6-9, we see that the time period before the flare exhibits the greatest
differences. After the flare, there are large oscillations in FLI 1-3 that could be confused with the
sinusoidal swings present in FLI 0 (see, for example, Figure 4). We thought that by restricting
the development of the discriminant vectors to just the pre-flare 1-minute eigenvector changes in
the sequences with OFLI 1-3, we might be able to enhance the discrimination among the FLI
groups. That is, we may get greater separation among the means of the discriminant functions of
the four FLI groups. We revised the development algorithm to incorporate the method used in
the Ha eigenvector - x-ray flux algorithm that determined the start time of the x-ray flux rise
associated with the prescribed flare in each OFLI 1-3 sequence. We then used only the 1-minute
eigenvector changes from the sequence start time up to that flare rise start time to determine the
frequency distributions. Any sequence that had fewer than 100 1-minute change times before
flare rise start was not allowed to contribute to the frequency distributions, and thus was not
involved in determining the discriminant vectors in the MVDA algorithm. For OFLI 0
sequences, we continued to use all of the 1-minute eigenvector changes for the entire sequence in
the frequency distributions. Therefore, the frequency distributions determined over all
development sequences in a given development set will be the same as before for FLI 0, but
should be more distinct for FLI 1-3.
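
The pre-flare restriction for development sequences can be sketched as below. This is a minimal illustration, not the study's code: the function name is hypothetical, and the flare rise start is assumed to be supplied as an index into the image times, since the algorithm that locates it is not described in this section.

```python
import numpy as np

MIN_PREFLARE_CHANGES = 100  # minimum stated in the text for development sequences

def preflare_changes(eigvecs, flare_rise_index=None, min_changes=MIN_PREFLARE_CHANGES):
    """Return 1-minute eigenvector changes before the flare rise, or None if too few.

    eigvecs: (n_e, n_times) array sampled at 1-minute intervals.
    flare_rise_index: hypothetical index of the image time at which the x-ray flux
        rise starts; None marks an OFLI 0 sequence, which keeps the whole sequence.
    """
    eigvecs = np.asarray(eigvecs, dtype=float)
    if flare_rise_index is None:
        return np.diff(eigvecs, axis=1)
    changes = np.diff(eigvecs[:, : flare_rise_index + 1], axis=1)
    if changes.shape[1] < min_changes:
        return None  # sequence does not contribute to the frequency distributions
    return changes
```

A sequence whose flare rise begins at image index 150 keeps 150 changes and qualifies, while one whose rise begins at index 50 is excluded.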

In Figure 14, we show the frequency distributions determined from the altered development
algorithm for the development set 1_2. These can be compared with their counterparts from the
whole sequence algorithm in Figure 12, keeping in mind changes in the size bin boundaries for
some of the eigenvectors. The development data set had the following numbers of 1-minute
changes that went into the frequency distributions by FLI category: for the original whole
sequence algorithm: 16315, 8669, 1329, 1796 for FLI 0-3; for the pre-flare algorithm: 16315,
2533, 624, 982 for FLI 0-3. Of the 60 image sequences in development set 1_2, 54 met the
minimum 100 1-minute change time requirement to participate in the algorithm. Looking at the
new frequency distributions in Figure 14, we see that as with the whole sequence algorithm, the
FLI 0 and 1 frequency distributions are quite similar. Both the FLI 2 and 3 frequency distributions
differ more from the FLI 0 and 1 frequency distributions than they did for the whole sequence
algorithm. More of the frequency of occurrence is concentrated in bins -1 and 1, and some of the
outlying bins have very little representation by any of the eigenvectors. The distributions for FLI
2 and 3 are different in that there is more variation with eigenvector in the two central bins in the
latter. This was also true for FLI 2 and 3 in the whole sequence algorithm. It is hard to determine
to what extent we achieved our goal of greater distinction among the frequency distributions of
the four FLI categories at this point - only that it seems we achieved a greater FLI 0-1 and FLI 2-3
difference.

Figure 14. Pre-Flare Method Frequency Distribution for Development Set 1_2

We then applied the discriminant vectors from the pre-flare development algorithm to the
respective application sets. We maintained the restriction of using only the pre-flare image times
to construct the frequency distribution of 1-minute eigenvector changes for each OFLI 1-3
application sequence. For the OFLI 0 sequences, we used the image times before the hour
nearest the mid-time of the sequence period. We diagnosed FLI category probabilities only for
the application sequences that had at least 180 image times resulting from these restrictions. This
eliminated nine of the 30 application sequences from being diagnosed for FLI category
probability. We selected the same four sequences from application set 3 to show the frequency
distributions of 1-minute eigenvector changes in Figure 15. For
perspective, the numbers of 1-minute changes making up the frequency distributions of the four
sequences were 441, 643, 444, 484 for the OFLI 0-3 sequences respectively for the whole
sequence algorithm (whose frequency distributions are shown in Figure 13), and 244, 286, 193,
199 for the OFLI 0-3 sequences respectively for the pre-flare algorithm (whose frequency
distributions are shown in Figure 15). We gained the perceived advantage of greater distinction
by FLI category in the development set frequency distributions, but suffered the disadvantage of
fewer 1-minute change values going into the frequency distributions in both development and
application. Frankly, the OFLI 0 application sequence frequency distribution in Figure 15 is very
irregular  and  does  not  resemble  any  of  the  four  frequency  distributions  of  the  development 
sequences  as  shown  in  Figure  14.  Eigenvectors  1-3  have  smaller  frequencies  of  occurrence  in 
bins  -1  and  1  for  the  OFLI  1  application  sequence  (Figure  15)  than  for  the  OFLI  1  development 
sequences  (Figure  14).  Even  so,  the  collective  frequency  distributions  overall  clearly  resemble 
FLI  1  more  than  FLI  3  (see  Figure  14),  yet  the  latter  was  indicated  as  most  likely.  The  OFLI  2 
sequence  (Figure  15)  is  a  complete  mystery  to  us  -  if  anything,  it  looks  more  like  the  FLI  3 
frequency  distribution  of  the  development  sequences,  certainly  not  FLI  1  which  was  what  was 
diagnosed  as  most  likely.  Finally,  it  is  more  understandable  that  the  OFLI  3  application  sequence 
could  be  confused  with  the  FLI  2  development  sequences’  frequency  distributions,  since  they  are 
similar  to  the  FLI  3  frequency  distributions. 

Figure 15. Same as Figure 14 Except for Four Image Sequences of Application Set 3

It appears that in employing the pre-flare version of the development and application algorithms,
we suffer from a smaller pool of 1-minute eigenvector changes (minimum of 180) from which to
construct the frequency distributions for each application sequence. Therefore, they are less
representative of the frequency distributions created from the 1-minute eigenvector changes of
the development sets’ sequences. We did not seem to profit from focusing on just the pre-flare
eigenvector information, even though it was more distinct by flare category than using 1-minute
eigenvector changes from the entire sequences.

In Figure 16 we show the probability-weighted diagnosed flaring categories vs. the OFLI for those
sequences that satisfied the minimum of 180 image times. In comparing these plots
with those from the whole sequence approach in Figure 11, we see that again most of the
PWDFC values are in the flaring category 1-2 domain. The pre-flare algorithms did not improve
the ability to properly diagnose the FLI 0 (non-flaring) sequences. Also as with the whole
sequence method, there is no obvious association of sequence-to-sequence variation between
PWDFC and OFLI. We list the performance metrics results for the pre-flare development and
application  algorithms  in  Table  4.  Numbers  in  parentheses  in  the  “Application  Set”  column 
denote  the  number  of  application  set  sequences  (out  of  30)  that  had  at  least  180  image  times  to 
construct  the  frequency  distributions.  The  results  for  application  sets  1  and  3  are  actually  worse 
than  for  the  whole  sequence  method  as  shown  in  Table  3,  while  application  set  2  has  values  that 
are  competitive.  The  level  of  positive  bias,  serious  lack  of  FLI  0  diagnoses,  and  the  large  FDF  in 
application  sets  1  and  3  suggest  that  the  liability  of  having  fewer  image  times  to  construct  the 
frequency  distributions  overcame  the  advantage  of  greater  category  distinction  in  the  use  of  just 
pre-flare  image  times. 

Figure 16. Pre-Flare Method PWDFC vs. OFLI for Application Sets (1), (2), (3)

Table 4. Performance Metrics for the Pre-Flare Development/Application Process

Application Set   Brier Score   Bias   Diagnosis Uncertainty   FDF    % DFLI 0   % OFLI 0
1 (15)            0.22          0.41   0.54                    1.73    0.00      86.67
2 (21)            0.17          0.24   0.59                    0.29   57.14      66.67
3 (21)            0.21          0.28   0.56                    1.24    4.76      66.67

7.  ISOON  DOPPLER  VELOCITY  AS  FLARE  PREDICTORS 

At  the  end  of  the  current  study  period,  we  considered  the  use  of  Doppler  velocity  signal  in  the 
ISOON  image  pixels  as  a  possible  way  to  augment  the  flare  indicator  information  from  the  Ha 
images.  At  one-minute  intervals,  the  plasma  speed  in  the  chromosphere  toward  (away  from)  the 
viewer  is  represented  by  blue  (red)  shift  in  the  wavelength  of  received  light.  The  degree  of  shift 
from  the  center  line  wavelength  indicates  the  speed,  which  we  refer  to  as  the  Doppler  velocity. 
We  processed  one-minute  interval  Doppler  velocity  in  each  ISOON  image  pixel  over  the  same 
sub-regions  of  solar  active  regions  processed  for  ISOON  Ha.  Processing  included  extracting  the 
data  in  the  sub-regions  from  the  whole  disk  image.  After  extraction,  the  images  were  normalized, 
spatially  oriented  and  aligned.  We  then  applied  principal  component  analysis  (PCA,  e.g.,  see 
Wilks  [3])  to  each  of  the  sub-region  sequences  of  one-minute  interval  Doppler  velocity  grids.  In 
this  way,  the  eigenvectors  and  eigenvalues  were  derived  in  a  manner  directly  analogous  to  the 
processing  conducted  on  the  one-minute  interval  Ha  image  sequences. 

In  Figures  17-20,  we  show  the  nine  leading  eigenvectors  and  the  cumulative  explained  variance 
as  derived  from  all  of  the  eigenvalues  for  four  image  sequences.  These  are  the  same  sequences 
for  which  we  showed  the  leading  Ha  eigenvectors  in  Figures  4,  6,  8,  and  9  respectively.  A  five- 
point  smoother  has  been  applied  to  the  eigenvectors  in  both  sets  of  figures.  Note  the  difference  in 
the  time  span  of  the  data  between  the  Ha  and  Doppler  velocity  figures. 
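
The quantities plotted in these figures (leading eigenvectors, cumulative explained variance, and a five-point smoothing) can be illustrated with a minimal sketch. It is not the processing code used in the study. It assumes the covariance is taken between image times, so that each eigenvector has one element per image time, and it assumes the five-point smoother is a centered running mean. All function names are hypothetical and the input data are random.

```python
import numpy as np

def pca_time_eigenvectors(frames):
    """PCA of an image sequence; frames has shape (n_times, n_pixels).

    Assumption: the covariance is taken between image times, so each eigenvector
    has one element per image time and can be plotted against time.
    """
    frames = np.asarray(frames, dtype=float)
    anomalies = frames - frames.mean(axis=1, keepdims=True)  # remove each image's mean
    cov = anomalies @ anomalies.T / (frames.shape[1] - 1)    # (n_times, n_times)
    eigvals, eigvecs = np.linalg.eigh(cov)                   # ascending order
    order = np.argsort(eigvals)[::-1]                        # reorder to descending
    return np.clip(eigvals[order], 0.0, None), eigvecs[:, order]

def cumulative_explained_variance(eigvals):
    """Fraction of total variance explained by the first k eigenvectors, for each k."""
    return np.cumsum(eigvals) / np.sum(eigvals)

def five_point_smooth(series):
    """Centered five-point running mean; the two points at each end are left unsmoothed."""
    series = np.asarray(series, dtype=float)
    out = series.copy()
    out[2:-2] = np.convolve(series, np.ones(5) / 5.0, mode="valid")
    return out

# Hypothetical sequence: 60 image times of a 20 x 20 pixel sub-region, flattened.
rng = np.random.default_rng(0)
frames = rng.normal(size=(60, 400))
eigvals, eigvecs = pca_time_eigenvectors(frames)
cev = cumulative_explained_variance(eigvals)
n_999 = int(np.searchsorted(cev, 0.999) + 1)  # eigenvectors needed to reach 99.9%
smoothed_first = five_point_smooth(eigvecs[:, 0])
```

For noise-dominated data such as this random example, nearly all eigenvectors are needed to reach 99.9% of the variance, whereas a sequence dominated by a few coherent patterns reaches it with far fewer.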

In comparing the leading Ha (Figure 4(a)) and Doppler velocity (Figure 17(a)) eigenvectors from
FLI 0 image sequence 20030117_10250, we see a pervasive sinusoidal pattern in the former
which only becomes apparent with eigenvector 2 and higher in the latter. With the imposed
smoother  acting  on  both  sets,  it  is  clear  that  there  is  much  more  high  frequency  variation  in  the 
Doppler  velocity  eigenvectors  than  for  the  Ha  eigenvectors.  This  may  also  be  reflected  in  the 
explained  cumulative  variance  shown  for  the  Doppler  velocity  in  Figure  17(b).  While  we  found 
that  99.9%  of  the  explained  variance  was  achieved  consistently  by  about  the  first  50  Ha 
eigenvectors  and  often  many  fewer  than  that,  here  we  see  that  for  this  sequence  it  is  only  attained 
at  the  highest  order  eigenvectors.  The  curving  Ha  eigenvector  patterns  of  Figure  6(a)  contrast 
with  the  flat  Doppler  velocity  eigenvectors  shown  in  Figure  18(a)  for  the  FLI  1  sequence 
20030522_10362. With the plentiful vertical spikes in the Doppler velocity eigenvector curves, it
is difficult to associate any particular inflection with the obvious flare inflection between 20 and
22 UTC in the Ha eigenvectors. Comparing Ha eigenvectors in Figure 8(a) with Doppler
velocity eigenvectors in Figure 19(a) for the FLI 2 sequence 20050506_10758 reveals a
markedly different signature. The Ha eigenvectors show the classic FLI 2 gently curving pre-flare
shape with the deep drop and rise associated with the flare. The Doppler velocity
eigenvectors display no obvious sign of the flare occurring between 16 and 18 UTC, and in fact
their sinusoidal patterns are similar to those of the FLI 0 Ha eigenvectors. Even more dramatic is
the slow rise of cumulative explained variance as shown in Figure 19(b). About 300 Doppler
velocity eigenvectors are required to represent 95% of the variance. Finally, in comparing the Ha
and Doppler velocity eigenvectors for FLI 3 sequence 20050909_10808 in Figures 9(a) and
20(a), respectively, we see a similar relationship between the patterns. The very clear signature of
the flare in the Ha eigenvectors is largely absent in the Doppler velocity eigenvectors, though we
do see a large rise in eigenvector 6 at about the expected rise time of the flare. In this case we see
about 95% of the variance explained by the first 200 eigenvectors.

Figure 17. 20030117_10250 Doppler Velocity (a) 1st 9 Eigenvectors, (b) Explained Variance

Figure 20. Same as in Figure 17 Except for 20050909_10808

It appears from this admittedly limited number of examples that more of the temporal variation
in the Doppler velocity is carried at much higher order eigenvectors than for Ha. This seems to
suggest that the Doppler velocity is a more spatially and temporally variant, or noisier, field than
the Ha imagery. We found in this study that the variation in eigenvector patterns among image
sequences of a particular FLI category limits the skill of the MVDA flare probability diagnosis
technique. We would surmise that the even more variant Doppler velocity would not benefit the
process. In addition, it appears that many more eigenvectors would be required to capture enough
of the explained variance to fully represent the physical phenomenon. The Doppler velocity may
still have information to contribute to flare diagnosis, and perhaps even prediction.
However, investigating this potential benefit would require another project on the scale of the
present one conducted on Ha imagery. From this limited analysis, we conclude that Doppler velocity,
as processed here, cannot suitably augment Ha imagery in the flare diagnosis techniques developed
and used in this study.
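
The eigenvector counts quoted in this section (about 50 for 99.9% in Ha, 200 to 300 for 95% in Doppler velocity) are read from the cumulative explained variance curves. A minimal sketch of that calculation follows, assuming the eigenvalues are already sorted in descending order; the function name `n_for_variance` is hypothetical.

```python
def n_for_variance(eigvals, fraction):
    """Smallest number of leading eigenvectors whose eigenvalues sum to at
    least `fraction` of the total variance (eigvals sorted descending)."""
    total = sum(eigvals)
    running = 0.0
    for k, lam in enumerate(eigvals, start=1):
        running += lam
        if running >= fraction * total:
            return k
    return len(eigvals)
```

For a steep spectrum such as `[5, 3, 1, 1]`, three quarters of the variance is reached with two eigenvectors, whereas a flat spectrum of ten equal eigenvalues needs all ten to reach 95%. The flat case is the behavior the Doppler velocity sequences resemble.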

8.  SUMMARY  AND  CONCLUSIONS 

The  current  study  period  was  a  sequel  to  the  original  study  of  the  diagnosis  of  flare  probability 
that  began  in  mid-2009.  The  focus  of  the  entire  effort  was  to  see  if  high-cadence  Ha  imagery 
could  be  used  to  detect,  and  ultimately  predict,  solar  flares.  We  made  two  major  choices  of  tools 
to  investigate  this  possibility.  First,  we  subjected  the  selected  sequences  of  Ha  images  to 
Principal  Component  Analysis  to  derive  the  eigenvectors  and  eigenvalues  of  the  one-minute 
interval  grids  of  pixel  values.  Second,  we  chose  multivariate  discriminant  analysis  as  the  data- 
driven  statistical  technique  to  derive  relationships  between  predictors  (Ha  data  in  the  form  of 
eigenvectors)  and  predictands  (degree  of  flaring).  Ultimately  we  settled  on  four  degrees  of 
flaring,  or  flaring  level  indicators,  for  the  predictand  categories.  We  noticed  that  the  temporal 
patterns  of  the  leading  eigenvectors  from  the  selected  sequences  seemed  to  fall  into  these  four 
groups.  At  the  same  time,  the  four  groups  were  independently  corroborated  by  peak  x-ray  flux 
associated  with  the  corresponding  degree  of  flaring.  We  used  the  combination  of  leading 
eigenvector  patterns  and  peak  x-ray  flux  to  assign  each  of  the  selected  sequences  to  the  four 
flaring  categories.  It  then  remained  to  determine  the  most  effective  way  to  stage  the  associated 
predictor  vectors  in  order  to  optimize  diagnosis  of  flare  category  probability. 

In  this  study,  we  tried  two  methods  involving  individual  image  time  flare  category  probability 
diagnosis.  First,  we  employed  the  legacy  flare  category  probability  development  and  application 
algorithms from the original study. This method used the unmodified eigenvector values from 25-50 leading
eigenvectors as the elements of the predictor vectors at each image time. Area-averaged Ha of
the sub-region from which the images were extracted, along with x-ray flux at the same
one-minute time intervals, served to identify the rise times of the flare if present. It was at these times
that predictands associated with the prescribed flaring level were assigned. Predictor vectors
paired  with  the  predictand  for  all  image  times  in  60  image  sequences  went  into  the  discriminant 
analysis  routine.  The  resulting  discriminant  vectors  were  applied  to  30  independent  image 
sequences  of  individual  image  time  predictor  vectors  derived  from  the  leading  eigenvectors.  The 
probability  of  each  of  the  four  flaring  categories  was  diagnosed  at  each  application  image  time. 
Comparing  them  with  the  prescribed  flaring  level  category  (non-zero  only  at  flare  rise  times) 
yielded  statistical  metrics  that  measured  the  diagnosis  performance. 
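
To make the development and application steps concrete, the sketch below substitutes a Gaussian discriminant classifier with a pooled (equal) covariance matrix for the MVDA routine used in the study. It is a stand-in under that assumption, not the authors' algorithm, and the names `fit_discriminant` and `category_probabilities` are hypothetical. Predictor vectors are rows of `X`, and `y` holds integer flaring categories 0 to 3.

```python
import numpy as np

def fit_discriminant(X, y, n_classes):
    """Development step: class means, inverse pooled covariance, priors.

    X: (n_samples, n_predictors) predictor vectors.
    y: (n_samples,) integer category labels in 0..n_classes-1
       (every category must occur at least once).
    """
    means = np.array([X[y == g].mean(axis=0) for g in range(n_classes)])
    priors = np.array([(y == g).mean() for g in range(n_classes)])
    resid = X - means[y]                      # within-group deviations
    pooled = resid.T @ resid / (len(X) - n_classes)
    return means, np.linalg.pinv(pooled), priors

def category_probabilities(x, means, pooled_inv, priors):
    """Application step: posterior probability of each category for x."""
    d = x - means                             # (n_classes, n_predictors)
    maha = np.einsum("gi,ij,gj->g", d, pooled_inv, d)
    log_post = np.log(priors) - 0.5 * maha
    log_post -= log_post.max()                # guard against underflow
    post = np.exp(log_post)
    return post / post.sum()
```

In this form, large within-group scatter inflates the pooled covariance, which shrinks the distances between category means and flattens the posterior probabilities.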

We  devised  a  second  individual  image  time  flare  category  probability  diagnosis  method  by 
modifying  the  first  one  in  two  major  ways.  First,  we  used  one-minute  changes  in  the  leading 
eigenvectors  as  predictors  in  place  of  the  full  eigenvector  values.  Second,  we  eliminated  use  of 
area-average  Ha  intensity  to  set  the  flare  rise  times,  relying  solely  on  x-ray  flux. 

In  terms  of  the  statistical  metrics  by  which  we  gauged  flare  category  probability  diagnosis 
performance,  the  two  individual  image  time  methods  performed  about  the  same.  The  probability- 
weighted  diagnosed  flare  category  at  each  time  was  compared  with  the  observed  flaring  category 
in  computing  Brier  Score  (essentially  mean  squared  difference),  Bias,  and  what  we  called 
Diagnosis Uncertainty. The latter was the average over all image times of one minus the largest
category probability. Both methods showed a consistent positive bias, indicating a tendency to
produce false alarms. In some non-flaring sequences, the methods tended to diagnose weak
flaring at an excessive number of image times. Also, because of an insufficient degree of
discrimination  among  the  flaring  categories,  diagnosis  uncertainty  was  unacceptably  large, 
averaging  39  to  48  percent  depending  on  the  method  and  set  of  development  and  application 
image  sequences  used.  This  means  that  the  largest  of  the  four  category  probabilities  diagnosed 
was  typically  52  to  61  percent  -  not  nearly  as  definitive  as  would  be  desired.  Neither  of  the  two 
individual  image  time  methods  could  create  enough  distinction  among  the  four  flaring  categories 
to  avoid  frequent  misdiagnosis,  especially  of  flaring  when  the  sub-region  was  not  flaring. 
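
The three metrics can be written out as follows. This is a sketch based only on the verbal definitions above; the sign convention for Bias (diagnosed minus observed, so that positive values mean over-diagnosis) is an assumption consistent with the false-alarm interpretation, and the function name `diagnosis_metrics` is hypothetical.

```python
def diagnosis_metrics(probs, observed):
    """probs[t]: probabilities of categories 0..3 at image time t.
    observed[t]: observed flaring category (0..3) at image time t.
    Returns (brier, bias, uncertainty)."""
    n = len(probs)
    # Probability-weighted diagnosed category at each time.
    weighted = [sum(c * p for c, p in enumerate(row)) for row in probs]
    # Mean squared difference between diagnosed and observed category.
    brier = sum((w - o) ** 2 for w, o in zip(weighted, observed)) / n
    # Mean difference; positive values indicate over-diagnosis (assumed sign).
    bias = sum(w - o for w, o in zip(weighted, observed)) / n
    # Average of one minus the largest category probability.
    uncertainty = sum(1.0 - max(row) for row in probs) / n
    return brier, bias, uncertainty
```

With this definition, an average uncertainty of 0.48 corresponds to a typical largest category probability of 0.52, and 0.39 to 0.61, which matches the ranges quoted above.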

A brief analysis of why some sequences were diagnosed well when others were not seemed to
indicate that there was significant variation of the eigenvector patterns within each flaring
category. This is a common problem in discriminant analysis: excessive within-group scatter. It
is not clear that this would be remedied by developing the technique over more image sequences.
Doing so might only broaden the range of predictor pattern variations. Another issue was that rises and
falls in the eigenvector curves of the non-flaring sequences were mistaken for flare-related
inflections when the flaring category probabilities were determined. Again, there was not
enough distance in discriminant space between the four groups to clearly delineate the flaring
category associated with an independent predictor vector.

We  tried  a  radically  different  approach  to  flare  category  diagnosis  by  involving  the  image 
sequence  as  a  whole  rather  than  trying  to  diagnose  at  individual  image  times.  We  represented  the 
patterns  of  the  leading  eigenvectors  through  frequency  distributions  of  their  one-minute  interval 
changes.  The  predictor  vector  elements  for  a  given  image  sequence  were  the  ten  bin  values  of 
frequency  of  occurrence  of  one-minute  change  for  eight  leading  eigenvectors.  We  used  the 
prescribed  flaring  category  as  the  predictand  for  the  image  sequence.  Again,  the  degree  of 
distinction among the flaring categories fell short. The collective frequency distributions among
the four flaring categories were insufficiently distinct to produce a clearly indicated most likely
flaring category. In fact, the diagnosis uncertainty was even greater than with the individual
image time diagnosis methods. We also tried restricting the frequency-of-occurrence predictor
elements to the image times prior to the flares, since the differences among the flaring
categories seemed more distinct during those times. But this resulted in fewer image times from
which to derive the frequency distributions, making them even more variable among the
sequences, so the technique was less able to discriminate.
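
A sketch of the whole-sequence predictor vector follows: ten frequency bins of one-minute change for each of eight leading eigenvectors, giving 80 elements per image sequence. The bin edges are not given in the text, so the symmetric range `max_abs_change`, the clipping of out-of-range changes into the end bins, and the use of relative rather than absolute frequencies are all assumptions, as is the function name `sequence_predictor`.

```python
import numpy as np

def sequence_predictor(eigvecs, n_vectors=8, n_bins=10, max_abs_change=0.05):
    """eigvecs: (T, K) array, column k = eigenvector k as a time series.
    Returns a vector of n_vectors * n_bins relative frequencies."""
    # One-minute changes of the leading eigenvectors: shape (T-1, n_vectors).
    changes = np.diff(eigvecs[:, :n_vectors], axis=0)
    # Hypothetical equal-width bins spanning +/- max_abs_change.
    edges = np.linspace(-max_abs_change, max_abs_change, n_bins + 1)
    # Fold out-of-range changes into the first and last bins.
    clipped = np.clip(changes, edges[0], edges[-1])
    features = []
    for k in range(n_vectors):
        counts, _ = np.histogram(clipped[:, k], bins=edges)
        features.append(counts / counts.sum())
    return np.concatenate(features)
```

Restricting the input rows to pre-flare image times reduces T, so each bin frequency is estimated from fewer changes. That is the source of the added variability among sequences noted above.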

Our original ultimate goal in this endeavor was to be able to predict flaring from the
high-cadence ISOON data. Unfortunately, we were not able to achieve a high enough level of flare
diagnosis skill to move from diagnosis to prediction. We still feel that the apparent distinction among the
flaring categories as evident in the Ha eigenvectors may have potential for useful short-term
flare prediction. However, we feel that we have reached the limit of what multivariate
discriminant analysis can achieve toward that goal. In the future, other data-driven statistical techniques
will be implemented in order to try to capitalize on that potential.